Efficient Processing of RDF Queries with Nested Optional Graph Patterns in an RDBMS
Authors
Abstract
Relational technology has proven very useful for scalable Semantic Web data management. Numerous researchers have proposed using RDBMSs to store and query voluminous RDF data with SQL and RDF query languages. In this article, we study how RDF queries with so-called well-designed graph patterns and nested optional patterns can be efficiently evaluated in an RDBMS. We propose to extend relational databases with a novel relational operator, nested optional join (NOJ), that is more efficient than left outer join in processing nested optional patterns of well-designed graph patterns. We design three efficient algorithms to implement the new operator in relational databases: (1) a nested-loops NOJ algorithm (NL-NOJ); (2) a sort-merge NOJ algorithm (SM-NOJ); and (3) a simple hash NOJ algorithm (SH-NOJ). Using a real-life RDF dataset, we demonstrate the efficiency of our algorithms by comparing them with the corresponding left outer join implementations, and we explore the effect of join selectivity on their performance.
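For orientation, the left outer join baseline that the abstract compares against can be sketched as follows. This is a minimal illustration of a simple hash left outer join applied to a nested optional pattern, not the NOJ operator or any of the paper's algorithms. The function name `hash_left_outer_join`, the row-as-dictionary representation, and the sample bindings are all assumptions made for the example, and the join key is assumed never to be NULL.

```python
from collections import defaultdict


def hash_left_outer_join(left, right, keys, right_only):
    """Left outer join of two lists of row dicts on the given key columns.

    Hypothetical helper for illustration; assumes join keys are never None.
    """
    # Build phase: index the right (optional) relation by its join key.
    index = defaultdict(list)
    for row in right:
        index[tuple(row[k] for k in keys)].append(row)

    result = []
    # Probe phase: every left row survives; unmatched rows are padded with None.
    for row in left:
        matches = index.get(tuple(row[k] for k in keys))
        if matches:
            for m in matches:
                result.append({**row, **{c: m[c] for c in right_only}})
        else:
            result.append({**row, **{c: None for c in right_only}})
    return result


# Hypothetical variable bindings for a nested optional pattern of the shape
#   ?p name ?n OPTIONAL { ?p email ?e OPTIONAL { ?p web ?w } }
names = [{"p": "a", "n": "Ann"}, {"p": "b", "n": "Bob"}]
emails = [{"p": "a", "e": "ann@example.org"}]
webs = [{"p": "a", "w": "http://example.org/ann"}]

# Evaluate the inner optional first, then join its result to the outer pattern.
inner = hash_left_outer_join(emails, webs, ["p"], ["w"])
rows = hash_left_outer_join(names, inner, ["p"], ["e", "w"])
```

In this sketch each level of nesting costs one full left outer join, which is the kind of evaluation the proposed NOJ operator is claimed to improve on.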
Similar resources
Relational Nested Optional Join for Efficient Semantic Web Query Processing
Increasing amount of RDF data on the Web drives the need for its efficient and effective management. In this light, numerous researchers have proposed to use RDBMSs to store and query RDF annotations using the SQL and SPARQL query languages. The first few attempts at SPARQL-to-SQL translation revealed non-trivial challenges related to correctness and efficiency of such translation in the presen...
Semantics Preserving SPARQL-to-SQL Query Translation for Optional Graph Patterns
The Semantic Web has recently gained tremendous momentum due to its great potential for providing a common framework that allows data to be shared and reused across application, enterprise, and community boundaries. While Resource Description Framework (RDF) is a W3C recommended language for representing data over the Semantic Web, SPARQL is an emerging W3C query language for RDF data. Although...
HPRD: A High Performance RDF Database
In this paper a high performance storage system for RDF documents is introduced. The system employs optimized index structures for RDF data and efficient RDF query evaluation. The index scheme consists of 3 types of indices. Triple index manages basic RDF triples by dividing original RDF graph into several sub-graphs. Path index manages frequent RDF path patterns for long path query performance...
Cascading map-side joins over HBase for scalable join processing
One of the major challenges in large-scale data processing with MapReduce is the smart computation of joins. Since Semantic Web datasets published in RDF have increased rapidly over the last few years, scalable join techniques become an important issue for SPARQL query processing as well. In this paper, we introduce the Map-Side Index Nested Loop Join (MAPSIN join) which combines scalable index...
Path discovery by Querying the federation of Relational Database and RDF Graph
The class of queries for detecting paths is important, as those can extract implicit binary relations over the nodes of input graphs. Most of the path querying languages used by the RDF community, like property paths in W3C SPARQL 1.1 and nested regular expressions in nSPARQL, are based on regular expressions. Federated queries allow for combining graph patterns and relational database that...
Journal:
- Int. J. Semantic Web Inf. Syst.
Volume 4, Issue
Pages -
Publication date: 2008